# 128k long context
Kernelllm GGUF
Other
KernelLLM is a model fine-tuned from Llama 3.1 Instruct, designed specifically for writing GPU kernels in Triton.
Large Language Model
lmstudio-community
214
1
Xgen Small 9B Instruct R
xGen-small is an enterprise-grade compact language model that achieves long-context performance with predictable low costs through domain-focused data curation, scalable pre-training, length extension, and reinforcement learning fine-tuning.
Large Language Model
Transformers English

Salesforce
97
4
Qwen2.5 VL 72B Instruct GGUF
Other
A multimodal large model from Tongyi Qianwen that takes image and text input and generates text, with 128k long-context processing and multilingual capabilities.
Image-to-Text English
lmstudio-community
668
1
Xlam 2 1b Fc R
xLAM-2 is a series of large action models developed by Salesforce, focusing on multi-turn dialogue and function calling capabilities, serving as the core component of AI agents.
Large Language Model
Transformers English

Salesforce
63
2
Llama 3.1 405B Instruct
Llama 3.1 is a multilingual large language model series developed by Meta, including 8B, 70B, and 405B scales, supporting multilingual text generation and code generation tasks.
Large Language Model
Transformers Supports Multiple Languages

meta-llama
34.83k
569
Saanvi C0 12B
Apache-2.0
A 12-billion-parameter large language model optimized for speed, efficiency, and contextual accuracy, supporting retrieval-augmented generation (RAG) and a 128k context window.
Large Language Model
Transformers

riple-saanvi-lab
170
2
Xlam 2 32b Fc R
xLAM-2 is Salesforce's next-generation large action model, focusing on multi-turn dialogue and function calling capabilities, capable of translating user intent into executable actions, serving as the core component of AI agents.
Large Language Model
Transformers English

Salesforce
319
4
Llama Xlam 2 8b Fc R
The xLAM-2 series consists of large action models trained with the APIGen-MT framework, focusing on multi-turn conversations and function calling, suitable for AI agent development.
Large Language Model
Transformers English

Salesforce
778
8
Llama Xlam 2 70b Fc R
xLAM-2 is a Large Action Model (LAM) series developed by SalesforceAIResearch, focusing on transforming user intent into executable actions to enhance AI agent decision-making capabilities.
Large Language Model
Transformers English

Salesforce
420
10
Gemma 3 Nine Rings Of Power Fiction Horror 4b It GGUF
Apache-2.0
Based on the Google Gemma-3 model, fine-tuned through 9 Neo and horror Imatrix methods, focusing on the generation of horror and fictional content.
Large Language Model English
DavidAU
6,418
1
Llama SEA LION V3 8B IT
SEA-LION is a series of large language models pre-trained and instruction-tuned for the Southeast Asian region, aimed at multilingual natural language processing of Southeast Asian languages.
Large Language Model
Transformers Supports Multiple Languages

aisingapore
3,954
7
Llama 3.2 3B Instruct
Llama 3.2 is a collection of multilingual large language models launched by Meta, including pre-trained and instruction-tuned generative models of 1B and 3B sizes. It is optimized for multilingual dialogue use cases and performs well on common industry benchmarks.
Large Language Model
Transformers Supports Multiple Languages

alpindale
1,691
8
Llama 3.2 3B Instruct
Llama 3.2 is a multilingual large language model series developed by Meta, including 1B and 3B scale pre-trained and instruction-tuned generative models, optimized for multilingual conversation scenarios.
Large Language Model
Transformers Supports Multiple Languages

meta-llama
1.6M
1,391
Phi 3.5 Mini ITA
MIT
A small yet powerful language model fine-tuned from Microsoft/Phi-3.5-mini-instruct and optimized for Italian language performance.
Large Language Model
Transformers Supports Multiple Languages

anakin87
8,495
13
Omost Phi 3 Mini 128k 8bits
Omost's phi-3-mini model with 128k context length, utilizing fp8 precision.
Large Language Model
Transformers

lllyasviel
47
7
Yarn Mistral 7B 128k AWQ
Apache-2.0
Yarn Mistral 7B 128K is an advanced language model optimized for long-context processing, further pre-trained on long-context data using the YaRN extension method, supporting a 128k token context window.
Large Language Model
Transformers English

TheBloke
483
72